Data Stream Algorithms

Author

  • S. Muthukrishnan
Abstract

How does one deal with massive data sets that are available for analysis? We will describe the classical data stream model, in which we make one pass over the data and, with sublinear resources, perform much of the data analysis we care about, such as finding frequent items, building summaries, compressed sensing, clustering, and others. We will present the basic algorithmic techniques used to build these more sophisticated analyses. In addition, we will present extensions to other problems (graphs, matrices, statistics) and to other models (probabilistic models, and parallel models such as Google’s MapReduce). One of the reasons this area of research thrives is its immediate application to a number of scenarios, which we will also describe.
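
As an illustration of the one-pass, sublinear-space setting described in the abstract, the sketch below finds candidate frequent items using the well-known Misra-Gries summary. The abstract does not name this algorithm; it is chosen here as an assumption, and the function name `misra_gries` and the parameter `k` are hypothetical. With at most `k - 1` counters, any item occurring more than `n / k` times in a stream of length `n` is guaranteed to remain in the summary, although the stored counts may underestimate the true frequencies.

```python
def misra_gries(stream, k):
    """One-pass frequent-items summary using at most k - 1 counters.

    Illustrative sketch only: any item whose true frequency exceeds
    len(stream) / k survives in the result, but stored counts are
    lower bounds, not exact frequencies.
    """
    counters = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: count this occurrence.
            counters[item] += 1
        elif len(counters) < k - 1:
            # Room for a new counter.
            counters[item] = 1
        else:
            # No room: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    # "a" appears 4 times out of 7, so it exceeds n / k for k = 2.
    print(misra_gries(["a", "b", "a", "c", "a", "d", "a"], k=2))
```

The summary never holds more than `k - 1` entries regardless of stream length, which is what makes the space sublinear; a second pass over the data (when available) can replace the underestimates with exact counts.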

Similar resources

Stream ciphers and the eSTREAM project

Stream ciphers are an important class of symmetric cryptographic algorithms. The eSTREAM project contributed significantly to the recent increase of activity in this field. In this paper, we present a survey of the eSTREAM project. We also review recent time/memory/data and time/memory/key trade-offs relevant for the generic attacks on stream ciphers.

Application of Data-Mining Algorithms in the Sensitivity Analysis and Zoning of Areas Prone to Gully Erosion in the Indicator Watersheds of Khorasan Razavi Province

Extended abstract. 1. Introduction: Gully erosion is one of the most important sources of sediment in the watersheds and a common phenomenon in semi-arid climate that affects vast areas with different morphological, soil and climatic conditions. This type of erosion is very dangerous due to the transfer of fertile soil horizons, and the reduction of water holding capacity also is a factor for s...

Data Stream Clustering Algorithms: A Review

Data stream mining has become a research area of some interest in recent years. The key challenge in data stream mining is extracting valuable knowledge in real time from a massive, continuous, dynamic data stream in only a single scan. Clustering is an efficient tool to overcome this problem. Data stream clustering can be applied in various fields such as financial transactions, telephone reco...

Optimization of sediment rating curve coefficients using evolutionary algorithms and unsupervised artificial neural network

The sediment rating curve (SRC) is a conventional and common regression model for estimating the suspended sediment load (SSL) from flow discharge. However, in most cases the log-transformation of the data in SRC models causes a bias that underestimates the SSL prediction. In this study, using the daily stream flow and suspended sediment load data from Shalman hydrometric station on Shalmanroud River, Guilan Pro...

A Stream Database Server for Sensor Applications

We present a framework for stream data processing that incorporates a stream database server as a fundamental component. The server operates as the stream control interface between arrays of distributed data stream sources and end-user clients that access and analyze the streams. The underlying framework provides novel stream management and query processing mechanisms to support the online acqu...

Study of Clear Air Turbulence over Iranian Plato

This study was carried out using two sets of numerical weather forecast data and flight reports for Clear Air Turbulence (CAT) over Iranian Plato to find atmospheric flow patterns favorable to the formation of CAT. The numerical data include five months of AVN analysis with a horizontal resolution of 1 degree (about 100 km) and four months of forecast data from the MM5 model with a resolution of 50 km. Impor...

Publication date: 2007